
 <!DOCTYPE HTML>
<html>
<head><meta name="generator" content="Hexo 3.9.0">
  <meta charset="UTF-8">
  
    <title>Nginx accesslog分离手机型号信息 | Zong&#39;s blog</title>
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=3, minimum-scale=1">
    
    <meta name="author" content="Zong">
    
    <meta name="description" content="最近在玩elk，虽然没有自己搭建环境，但在配置收集log方面也有一些小心得值得记录一下。今天先记录一下这个，就是把access log里面http_user_agent字段中关于手机型号的信息分离出来，新增一个字段。首先，为了把nginx的access log导入elk，已经对其格式进行了json化">
    
    
    
    
    
    <link rel="icon" href="/img/favicon.ico">
    
    
    <link rel="apple-touch-icon" href="/img/pacman.jpg">
    <link rel="apple-touch-icon-precomposed" href="/img/pacman.jpg">
    
    <link rel="stylesheet" href="/css/style.css">
</head>
</html>
  <body>
    <header>
      <div>
		
			<div id="imglogo">
				<a href="/"><img src="/img/logo.svg" alt="Zong&#39;s blog" title="Zong&#39;s blog"/></a>
			</div>
			
			<div id="textlogo">
				<h1 class="site-name"><a href="/" title="Zong&#39;s blog">Zong&#39;s blog</a></h1>
				<h2 class="blog-motto">日常积累，技术分享</h2>
			</div>
			<div class="navbar"><a class="navbutton navmobile" href="#" title="Menu">
			</a></div>
			<nav class="animated">
				<ul>
					<ul>
					 
						<li><a href="/">Home</a></li>
					
						<li><a href="/archives">Archives</a></li>
					
						<li><a href="/categories/运维">运维</a></li>
					
						<li><a href="/categories/容器架构">容器架构</a></li>
					
					<li>
					
					<form class="search" action="//baidu.com/s" method="get" accept-charset="utf-8">
						<label>Search</label>
						<input type="text" id="search" name="wd" autocomplete="off" maxlength="20" placeholder="Search" />
                        <input name=tn type=hidden value="bds">
                        <input name=cl type=hidden value="3">
                        <input name=ct type=hidden value="2097152">
						<input type="hidden" name="si" value="www.lstop.pub">
					</form>
					
					</li>
				</ul>
			</nav>			
</div>

    </header>
    <div id="container">
      <div id="main" class="post" itemscope itemprop="blogPost">
	<article itemprop="articleBody"> 
		<header class="article-info clearfix">
  <h1 itemprop="name">
    
      <a href="/2016/08/15/Nginx-accesslog分离手机型号信息/" title="Nginx accesslog分离手机型号信息" itemprop="url">Nginx accesslog分离手机型号信息</a>
  </h1>
  <p class="article-author">By
    
      <a href="http://www.lstop.pub" title="Zong">Zong</a>
    </p>
  <p class="article-time">
    <time datetime="2016-08-15T01:26:52.000Z" itemprop="datePublished">2016-08-15</time>
    
  </p>
</header>

	<div class="article-content">
		
		
		<div id="toc" class="toc-article">
			<strong class="toc-title">Contents</strong>
		
		</div>
		
		<p>最近在玩elk，虽然没有自己搭建环境，但在配置收集log方面也有一些小心得值得记录一下。<br>今天先记录一下这个，就是把access log里面http_user_agent字段中关于手机型号的信息分离出来，新增一个字段。<br>首先，为了把nginx的access log导入elk，已经对其格式进行了json化：</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">&#123;</span><br><span class="line">    &quot;@timestamp&quot;: &quot;2016-08-15T09:58:54+08:00&quot;,</span><br><span class="line">    &quot;remote_addr&quot;: &quot;117.25.139.53&quot;,</span><br><span class="line">    &quot;remote_user&quot;: &quot;-&quot;,</span><br><span class="line">    &quot;body_bytes_sent&quot;: 87,</span><br><span class="line">    &quot;request_time&quot;: 0.089,</span><br><span class="line">    &quot;status&quot;: &quot;200&quot;,</span><br><span class="line">    &quot;request&quot;: &quot;POST /index.html&quot;,</span><br><span class="line">    &quot;request_method&quot;: &quot;POST&quot;,</span><br><span class="line">    &quot;http_referrer&quot;: &quot;http://www.lstop.pub/index.html&quot;,</span><br><span class="line">    &quot;http_x_forwarded_for&quot;: &quot;125.93.83.94&quot;,</span><br><span class="line">    &quot;http_x_real_ip&quot;: &quot;125.93.83.94&quot;,</span><br><span class="line">    &quot;bytes_sent&quot;: 256,</span><br><span class="line">    &quot;upstream_response_time&quot;: 0.089,</span><br><span class="line">    &quot;http_user_agent&quot;: &quot;Mozilla/5.0 (Linux; Android 5.1; OPPO R9tm Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/37.0.0.0 Mobile MQQBrowser/6.2 TBS/036555 Safari/537.36 MicroMessenger/6.3.23.840 NetType/WIFI Language/zh_CN&quot;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>

<p>可以看到，如果访问的客户端是手机的话，user agent是这么一大串</p>
<blockquote>
<p>“http_user_agent”: “Mozilla/5.0 (Linux; Android 5.1; OPPO R9tm Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/37.0.0.0 Mobile MQQBrowser/6.2 TBS/036555 Safari/537.36 MicroMessenger/6.3.23.840 NetType/WIFI Language/zh_CN”</p>
</blockquote>
<p>我们要的是这些</p>
<blockquote>
<p>Android 5.1; OPPO R9tm Build/LMY47I</p>
</blockquote>
<p>总体思路是用sed，截取需要的信息，在json文档添加一个新字段phoneType</p>
<p>截取信息正则表达式当然是首选，一开始写了这个单纯的正则式<br>Android或iPhone OS开头，到第一个”)”，不包括”)”，测试完美匹配。</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">(Android|iPhone OS).+?(?=\))</span><br></pre></td></tr></table></figure>

<p>坑爹的事开始来了，sed不支持懒惰匹配。。那零宽断言也没用了。。<br>好吧，变成简单的，这个残缺美。。</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">(Android|iPhone OS).+AppleWebKit</span><br></pre></td></tr></table></figure>

<p>丑了点，关键能用，匹配出来了</p>
<blockquote>
<p>Android 5.1; OPPO R9tm Build/LMY47I) AppleWebKit</p>
</blockquote>
<p>好，坑爹之二，圆括号太多，正则式的圆括号在sed命令死活写不正确，干脆分两条了<br>进一步丑</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Android.+AppleWebKit</span><br><span class="line">iPhone.+AppleWebKit</span><br></pre></td></tr></table></figure>

<p>经过两个坑后，可以进行下一步<br>因为是新增一个phoneType字段，就要重新组装access log。<br>使用sed的圆括号特性:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">&quot;s/(^.*)(regexp)(.*)(\&#125;$)/\1\2\3\2\4/g&quot;</span><br></pre></td></tr></table></figure>

<p>最终命令</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">tail -F /var/log/nginx/access_www_weixiao100.log|sed -r -e &quot;s/: -/: 0/g&quot; -e &quot;s/(^.*)(Android.+ AppleWebKit)(.*)(\&#125;$)/\1\2\3,\&quot;phoneType\&quot;:\&quot;\2\&quot;\4/g&quot; -e &quot;s/(^.*)(iPhone.+ AppleWebKit)(.*)(\&#125;$)/\1\2\3,\&quot;phoneType\&quot;:\&quot;\2\&quot;\4/g&quot; | sudo filebeat.sh -c /etc/filebeat/filebeat-nginx.yml  &amp;</span><br></pre></td></tr></table></figure>

<p>结果长这样，可以很好的导入elk，对手机型号进行一些分析统计了！</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">&#123;</span><br><span class="line">    &quot;@timestamp&quot;: &quot;2016-08-15T10:23:53+08:00&quot;,</span><br><span class="line">    &quot;remote_addr&quot;: &quot;117.25.139.52&quot;,</span><br><span class="line">    &quot;remote_user&quot;: &quot;-&quot;,</span><br><span class="line">    &quot;body_bytes_sent&quot;: 6550,</span><br><span class="line">    &quot;request_time&quot;: 0.004,</span><br><span class="line">    &quot;status&quot;: &quot;200&quot;,</span><br><span class="line">    &quot;request&quot;: &quot;GET /index.html HTTP/1.1&quot;,</span><br><span class="line">    &quot;request_method&quot;: &quot;GET&quot;,</span><br><span class="line">    &quot;http_referrer&quot;: &quot;http://www.lstop.pub/index.html&quot;,</span><br><span class="line">    &quot;http_x_forwarded_for&quot;: &quot;117.136.53.193&quot;,</span><br><span class="line">    &quot;http_x_real_ip&quot;: &quot;117.136.53.193&quot;,</span><br><span class="line">    &quot;bytes_sent&quot;: 6817,</span><br><span class="line">    &quot;upstream_response_time&quot;: 0.004,</span><br><span class="line">    &quot;http_user_agent&quot;: &quot;Mozilla/5.0 (Linux; Android 4.2.2; 2014011 Build/HM2014011) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/37.0.0.0 Mobile MQQBrowser/6.2 TBS/036519 Safari/537.36 MicroMessenger/6.3.22.821 NetType/cmnet Language/zh_CN&quot;,</span><br><span class="line">    &quot;phoneType&quot;: &quot;Android 4.2.2; 2014011 Build/HM2014011) AppleWebKit&quot;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>  
	</div>
		<footer class="article-footer clearfix">

  <div class="article-tags">
  
  <span></span> <a href="/tags/elk/">elk</a>
  </div>


<div class="article-categories">
  <span></span>
  <a class="article-category-link" href="/categories/运维/">运维</a>
</div>



<div class="article-share" id="share">

  <div data-url="http://www.lstop.pub/2016/08/15/Nginx-accesslog分离手机型号信息/" data-title="Nginx accesslog分离手机型号信息 | Zong&#39;s blog" data-tsina="" class="share clearfix">
  </div>

</div>
</footer>   	       
	</article>
	
<nav class="article-nav clearfix">
 
 <div class="prev" >
 <a href="/2016/08/22/MongoDB权限管理/" title="MongoDB权限管理">
  <strong>PREVIOUS:</strong><br/>
  <span>
  MongoDB权限管理</span>
</a>
</div>


<div class="next">
<a href="/2016/08/10/网络运营商代码注入/"  title="网络运营商代码注入">
 <strong>NEXT:</strong><br/> 
 <span>网络运营商代码注入
</span>
</a>
</div>

</nav>

	
</div>  
      <div class="openaside"><a class="navbutton" href="#" title="Show Sidebar"></a></div>

  <div id="toc" class="toc-aside">
  <strong class="toc-title">Contents</strong>
  
  </div>

<div id="asidepart">
<div class="closeaside"><a class="closebutton" href="#" title="Hide Sidebar"></a></div>
<aside class="clearfix">

  
<div class="tagslist">
	<p class="asidetitle">Tags</p>
		<ul class="clearfix">
		
			<li><a href="/tags/Airtest/" title="Airtest">Airtest<sup>1</sup></a></li>
		
			<li><a href="/tags/DNS/" title="DNS">DNS<sup>1</sup></a></li>
		
			<li><a href="/tags/GitLab/" title="GitLab">GitLab<sup>1</sup></a></li>
		
			<li><a href="/tags/K8s/" title="K8s">K8s<sup>8</sup></a></li>
		
			<li><a href="/tags/Linux/" title="Linux">Linux<sup>1</sup></a></li>
		
			<li><a href="/tags/MongoDB/" title="MongoDB">MongoDB<sup>2</sup></a></li>
		
			<li><a href="/tags/OpenWrt/" title="OpenWrt">OpenWrt<sup>1</sup></a></li>
		
			<li><a href="/tags/Python/" title="Python">Python<sup>2</sup></a></li>
		
			<li><a href="/tags/RabbitMQ/" title="RabbitMQ">RabbitMQ<sup>1</sup></a></li>
		
			<li><a href="/tags/calico/" title="calico">calico<sup>1</sup></a></li>
		
			<li><a href="/tags/cdn/" title="cdn">cdn<sup>1</sup></a></li>
		
			<li><a href="/tags/docker/" title="docker">docker<sup>3</sup></a></li>
		
			<li><a href="/tags/docker-registry/" title="docker registry">docker registry<sup>1</sup></a></li>
		
			<li><a href="/tags/elasticsearch/" title="elasticsearch">elasticsearch<sup>3</sup></a></li>
		
			<li><a href="/tags/elk/" title="elk">elk<sup>3</sup></a></li>
		
			<li><a href="/tags/k8s/" title="k8s">k8s<sup>3</sup></a></li>
		
			<li><a href="/tags/kubernetes/" title="kubernetes">kubernetes<sup>1</sup></a></li>
		
			<li><a href="/tags/nginx/" title="nginx">nginx<sup>1</sup></a></li>
		
			<li><a href="/tags/python/" title="python">python<sup>1</sup></a></li>
		
			<li><a href="/tags/tomcat/" title="tomcat">tomcat<sup>1</sup></a></li>
		
		</ul>
</div>


  <div class="linkslist">
  <p class="asidetitle">Links</p>
    <ul>
      <li><a href="http://www.v2ex.com/?r=zong400" target="_blank" title="V2EX">V2EX</a></li>
      <li><a href="http://hexo.io" target="_blank" title="Hexo">Hexo</a></li>
	  <li><a href="https://promotion.aliyun.com/ntms/yunparter/invite.html?userCode=s0bh6uzq" target="_blank" title="阿里云">阿里云</a></li>
	  <li><a href="https://cloud.tencent.com/redirect.php?redirect=1014&cps_key=5bd9deb84d4d9f34b65fb934e12d03e3&from=console" target="_blank" title="腾讯云">腾讯云</a></li>
    </ul>
</div>


</aside>
</div>
    </div>
    <footer><div id="footer" >
	
	
	<div class="social-font" class="clearfix">
		
		
		
		
	</div>
		<p class="copyright">Hosted by <a href="https://pages.coding.me/" target="_blank" title="Coding Pages">Coding Pages</a></p>
		<p class="copyright">Powered by <a href="http://hexo.io" target="_blank" title="hexo">hexo</a> and Theme by <a href="https://github.com/wizicer/iceman" target="_blank" title="Iceman">Iceman</a> © 2020 
		
		<a href="http://www.lstop.pub" target="_blank" title="Zong">Zong</a>
		
		</p>
</div>
</footer>
    <script src="//cdn.staticfile.org/jquery/2.1.0/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){ 
  $('.navbar').click(function(){
    $('header nav').toggleClass('shownav');
  });
  var myWidth = 0;
  function getSize(){
    if( typeof( window.innerWidth ) == 'number' ) {
      myWidth = window.innerWidth;
    } else if( document.documentElement && document.documentElement.clientWidth) {
      myWidth = document.documentElement.clientWidth;
    };
  };
  var m = $('#main'),
      a = $('#asidepart'),
      c = $('.closeaside'),
      o = $('.openaside');
  $(window).resize(function(){
    getSize(); 
    if (myWidth >= 1024) {
      $('header nav').removeClass('shownav');
    }else
    {
      m.removeClass('moveMain');
      a.css('display', 'block').removeClass('fadeOut');
      o.css('display', 'none');
      
      $('#toc.toc-aside').css('display', 'none');
        
    }
  });
  c.click(function(){
    a.addClass('fadeOut').css('display', 'none');
    o.css('display', 'block').addClass('fadeIn');
    m.addClass('moveMain');
  });
  o.click(function(){
    o.css('display', 'none').removeClass('beforeFadeIn');
    a.css('display', 'block').removeClass('fadeOut').addClass('fadeIn');      
    m.removeClass('moveMain');
  });
  $(window).scroll(function(){
    o.css("top",Math.max(80,260-$(this).scrollTop()));
  });
});
</script>

<script type="text/javascript">
$(document).ready(function(){ 
  var ai = $('.article-content>iframe'),
      ae = $('.article-content>embed'),
      t  = $('#toc'),
      h  = $('article h2')
      ah = $('article h2'),
      ta = $('#toc.toc-aside'),
      o  = $('.openaside'),
      c  = $('.closeaside');
  if(ai.length>0){
    ai.wrap('<div class="video-container" />');
  };
  if(ae.length>0){
   ae.wrap('<div class="video-container" />');
  };
  if(ah.length==0){
    t.css('display','none');
  }else{
    c.click(function(){
      ta.css('display', 'block').addClass('fadeIn');
    });
    o.click(function(){
      ta.css('display', 'none');
    });
    $(window).scroll(function(){
      ta.css("top",Math.max(140,320-$(this).scrollTop()));
    });
  };
});
</script>


<script type="text/javascript">
$(document).ready(function(){ 
  var $this = $('.share'),
      url = $this.attr('data-url'),
      encodedUrl = encodeURIComponent(url),
      title = $this.attr('data-title'),
      tsina = $this.attr('data-tsina');
  var html = [
  '<a href="#" class="overlay" id="qrcode"></a>',
  '<div class="qrcode clearfix"><span>扫描二维码分享到微信朋友圈</span><a class="qrclose" href="#share"></a><strong>Loading...Please wait</strong><img id="qrcode-pic" data-src="http://s.jiathis.com/qrcode.php?url=' + encodedUrl + '"/></div>',
  '<a href="#textlogo" class="article-back-to-top" title="Top"></a>',
  '<a href="https://www.facebook.com/sharer.php?u=' + encodedUrl + '" class="article-share-facebook" target="_blank" title="Facebook"></a>',
  '<a href="#qrcode" class="article-share-qrcode" title="QRcode"></a>',
  '<a href="https://twitter.com/intent/tweet?url=' + encodedUrl + '" class="article-share-twitter" target="_blank" title="Twitter"></a>',
  '<a href="http://service.weibo.com/share/share.php?title='+title+'&url='+encodedUrl +'&ralateUid='+ tsina +'&searchPic=true&style=number' +'" class="article-share-weibo" target="_blank" title="Weibo"></a>',
  '<span title="Share to"></span>'
  ].join('');
  $this.append(html);
  $('.article-share-qrcode').click(function(){
    var imgSrc = $('#qrcode-pic').attr('data-src');
    $('#qrcode-pic').attr('src', imgSrc);
    $('#qrcode-pic').load(function(){
        $('.qrcode strong').text(' ');
    });
  });
});     
</script>









  </body>
</html>

